Ramped Half-n-Half Initialisation Bias in GP
Abstract
Tree initialisation techniques for genetic programming (GP) are examined in [4,3], highlighting a bias in the standard implementation of the initialisation method Ramped Half-n-Half (RHH) [1]. GP trees typically evolve to random shapes, even when the initial population consisted of full or minimal trees [2]. In canonical GP, unbalanced and sparse trees increase the probability that bigger subtrees are selected for recombination, so code growth occurs faster and subtree crossover has more difficulty producing trees within the specified depth limits. The ability to evolve tree shapes that allow more legal crossover operations (bushier trees provide more possible crossover points) and that control code growth is therefore critical. The GP community often uses RHH [4]. The standard implementation of the RHH method selects either the grow or the full method, each with probability 0.5, to produce a tree. If the tree is already in the initial population, it is discarded and another is created by grow or full. Because duplicates are typically not allowed, this standard implementation of RHH favours full over grow and possibly biases the evolutionary process. The full and grow methods are similar algorithms for recursively producing GP trees. The full algorithm makes trees whose branches all extend to the maximum initial depth. The grow algorithm does not require this and allows branches of varying length (up to the maximum initial depth). As many more unique full trees exist (because full trees contain more nodes), there is a tendency, especially with particular function and terminal sets, for the grow method to produce more duplicate trees. To estimate the bias of the RHH method with a particular function and terminal set, we use the results from Luke [3] (Lemma 1).
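The duplicate-rejecting RHH procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function set `FUNCTIONS`, the terminal set `TERMINALS`, the tuple-based tree encoding, the depth ramp, and all function names are assumptions made for the example.

```python
import random

# Hypothetical primitive sets (assumptions for illustration only).
FUNCTIONS = {"and": 2, "or": 2, "not": 1}  # name -> arity
TERMINALS = ["x", "y"]


def full(depth, rng):
    """Build a tree in which every branch reaches exactly `depth`."""
    if depth == 0:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name,) + tuple(full(depth - 1, rng) for _ in range(FUNCTIONS[name]))


def grow(depth, rng):
    """Build a tree whose branches may stop early but never exceed `depth`."""
    if depth == 0:
        return rng.choice(TERMINALS)
    # Assumption: pick uniformly from functions and terminals together.
    choice = rng.choice(sorted(FUNCTIONS) + TERMINALS)
    if choice in FUNCTIONS:
        return (choice,) + tuple(grow(depth - 1, rng) for _ in range(FUNCTIONS[choice]))
    return choice


def ramped_half_and_half(pop_size, min_depth, max_depth, rng):
    """Create `pop_size` unique trees; duplicates are discarded and redrawn."""
    population, seen = [], set()
    counts = {"full": 0, "grow": 0}  # which method supplied each accepted tree
    depths = range(min_depth, max_depth + 1)
    while len(population) < pop_size:
        depth = depths[len(population) % len(depths)]  # ramp over depth limits
        method = "full" if rng.random() < 0.5 else "grow"
        tree = full(depth, rng) if method == "full" else grow(depth, rng)
        if tree in seen:
            continue  # rejected duplicate: draw again with grow or full
        seen.add(tree)
        population.append(tree)
        counts[method] += 1
    return population, counts


rng = random.Random(0)
population, counts = ramped_half_and_half(200, 2, 4, rng)
print(counts)  # accepted trees per method; a full/grow imbalance reflects the bias
```

Counting accepted trees per method, as `counts` does here, is one simple way to observe how far the rejection step shifts the split away from 50/50 for a given function and terminal set.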
The expected number of nodes $E_d$ at depth $d$ is defined as $E_d = 1$ if $d = 0$ and $E_d = E_{d-1}\,pb$ if $d > 0$, where $pb$ is the expected number of children of a new node ($p$ is the probability of picking a nonterminal, and $b$ is the expected number of children of a nonterminal). Luke [3] uses this to calculate the expected size of trees in the infinite case. Here we bound the calculation to depth $d = 4$. In our analysis, $E_d$ correctly predicted that the full method would contribute more trees to the initial population whenever the expected tree sizes of the two methods were not similar (i.e. grow made smaller trees, causing more duplicates and rejected trees). Canonical GP trees grow in size up to the maximum depth limits, making the initial trees seeds for the evolutionary process. As the full algorithm is more likely to evolve larger and bushier trees with more nodes than the grow method, we conduct an experimental study to observe these differences in the evolutionary
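The recurrence for the expected number of nodes per depth can be evaluated directly. The sketch below assumes that the bounded expected tree size is the sum of $E_d$ over depths 0 to 4, and it uses hypothetical values of $p$ and $b$ (a binary function set, with $p = 1$ standing in for the full method and $p = 0.5$ for a grow method choosing evenly between functions and terminals); none of these numbers come from the source.

```python
def expected_nodes_at_depth(d, p, b):
    """E_d from the recurrence E_0 = 1, E_d = E_{d-1} * p * b."""
    nodes = 1.0
    for _ in range(d):
        nodes *= p * b
    return nodes


def expected_size(max_depth, p, b):
    """Assumed bounded size: sum of E_d for d = 0 .. max_depth."""
    return sum(expected_nodes_at_depth(d, p, b) for d in range(max_depth + 1))


# Hypothetical example: all nonterminals are binary (b = 2).
full_size = expected_size(4, p=1.0, b=2.0)  # full: always pick a nonterminal above the depth limit
grow_size = expected_size(4, p=0.5, b=2.0)  # grow: nonterminal chosen half of the time
print(full_size, grow_size)
```

Under these assumed values the two expected sizes differ widely, which is the situation in which the abstract reports that the full method contributes more trees to the initial population.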
Similar resources
Size Fair and Homologous Tree Crossovers
Size fair and homologous crossover genetic operators for tree based genetic programming are described and tested. Both produce considerably reduced increases in program size (i.e. less bloat) and no detrimental effect on GP performance. GP search spaces are partitioned by the ridge in the number of programs v. their size and depth. While search efficiency is little affected by initial conditions, th...
GECCO-99: Proceedings of the Genetic and Evolutionary Computation Conference
Size fair and homologous crossover genetic operators for tree based genetic programming are described and tested. Both produce considerably reduced increases in program size and no detrimental effect on GP performance. GP search spaces are partitioned by the ridge in the number of programs v. their size and depth. A ramped uniform random initialisation is described which straddles the ridge. With...
Size Fair and Homologous Tree Genetic Programming Crossovers
Size fair and homologous crossover genetic operators for tree based genetic programming are described and tested. Both produce considerably reduced increases in program size and no detrimental effect on GP performance. GP search spaces are partitioned by the ridge in the number of programs v. their size and depth. A ramped uniform random initialisation is described which straddles the ridge. With...
A Survey and Comparison of Tree Generation Algorithms
This paper discusses and compares five major tree-generation algorithms for genetic programming, and their effects on fitness: RAMPED HALF-AND-HALF, PTC1, PTC2, RANDOMBRANCH, and UNIFORM. The paper compares the performance of these algorithms on three genetic programming problems (11-Boolean Multiplexer, Artificial Ant, and Symbolic Regression), and discovers that the algorithms do not have a s...
Grammar Bias and Initialisation in Grammar Based Genetic Programming
Preferential language biases which are introduced when using Tree-Adjoining Grammars in Grammatical Evolution affect the distribution of generated derivation structures, and as such, present difficulties when designing initialisation methods. Similar initial populations allow for a fairer comparison between different GP methods. This work proposes methods for dealing with these biases and exami...